home *** CD-ROM | disk | FTP | other *** search
- Path: po.CWRU.Edu!mab22
- From: mab22@po.CWRU.Edu (Michael A. Balfour)
- Newsgroups: comp.lang.c
- Subject: Re: unique id for a string
- Date: 28 Feb 1996 18:56:06 GMT
- Organization: Case Western Reserve University, Cleveland, OH (USA)
- Message-ID: <4h28g6$p66@madeline.INS.CWRU.Edu>
- References: <1996Feb16.175601.114182@kuhub.cc.ukans.edu> <3129dd39.961626@news.iquest.net> <4gfpveINNe68@keats.ugrad.cs.ubc.ca> <4gih8a$b62@madeline.INS.CWRU.Edu> <4gsh3hINNf3b@keats.ugrad.cs.ubc.ca>
- Reply-To: mab22@po.CWRU.Edu (Michael A. Balfour)
- NNTP-Posting-Host: kanga.ins.cwru.edu
-
-
- In a previous article, c2a192@ugrad.cs.ubc.ca (Kazimir Kylheku) says:
-
- >In article <4gih8a$b62@madeline.INS.CWRU.Edu>,
- >Michael A. Balfour <mab22@po.CWRU.Edu> wrote:
- > >
- > >Actually, it's not *that* difficult to find. Try running gperf from GNU
- > >or reading the paper cited in my previous post in this subject. The
- > >results in the paper claim to be able to find a perfect hash function
- > >for 262,144 18-character words in 16.087 seconds on a Sun Sparc 2.
- > >Doesn't sound that difficult to me. :)
- > >
- > >>One way to get a unique mapping is to simply assign increasing numbers to the
- > >>200,000+ words, and use a hashing technique to quickly look up a word given its
- > >>integer tag, and look up the tag given the word. With the proper
- > >>represenatation, both operations can be done in "constant time".
- > >
- > >Once again, using a perfect minimal hash function will provide an easy
- > >method for accomplishing this.
- >
- >Definitely. But that assumes that the dictionary is not open to new members.
- >Otherwise you have to recompute the hashing function each time you add a
- >member. Does the paper discuss such ``loose ends''?
-
- There's no mention of how to handle adding new members, but you're
- certainly correct. You would want to generate a new hashing function
- each time you updated the dictionary. Of course, the way I'm using
- this algorithm in my own code is to dynamically create the hash function
- at runtime after reading in the desired list of words. So off-line
- updates to the dictionary wouldn't affect my code in the least.
-
- >
- >I am probably going to have a look at it, since perfect hashing functions are
- >interesting things. :)
-
- Once again, I heartily recommend this technique. I've had the chance to
- try running it now, and it works like magic! The paper can be found at
- ftp.cs.uq.edu.au in /pub/TECHREPORTS/departments/TR0242.ps.Z (I think)
- and the source code is in the same directory in the file
- perfes_10.tar.Z. Happy hunting!
-
- Mike Balfour
- --
- ----------------------------------+--------------------------------
- Mike Balfour, Partner | BS/MS Graduate - ECMP
- Overload Engineering | Case Western Reserve University
- "New Ideas for a Brighter Future" | Cleveland, OH
-